Introduction



The Consortium on Chicago School Research
(CCSR) and the Illinois Business Roundtable 
(IBRT) have historic interests in uses of stu- 
dent assessment information. The two organizations 
came together in fall 1999 to plan and then conduct a
survey of representative school districts across the state. 
The survey focused on three dimensions of local as- 
sessment practices within the districts: why districts 
give tests, what tests they give, and what they do with 
their test results. Finally, the survey solicited responses
from districts about the Illinois Standards Achieve- 
ment Tests (ISAT) program, how it could be improved, 
and how it could better meet districts' needs.

Procedures 

The Consortium drew a random sample of 60 dis- 
tricts across the state, drawing proportionately from 
metropolitan and downstate districts and assuring representation
of small, medium, and large-sized districts.
In addition, CCSR identified the 20 largest school 
districts in Illinois. With the assistance of Research 
Partnerships of Wheaton, Illinois, telephone interviews 
were conducted with district assessment coordinators 
or superintendents. Principal researchers from CCSR 
and IBRT interviewed representatives from the largest
districts, and staff from Research Partnerships interviewed
the remaining districts. The interviews lasted
between 30 and 45 minutes. Seventy-five districts com- 
pleted the assessment survey, providing a fair repre- 
sentation of district testing practices statewide. 

Results 

Purposes of District 
Assessment Programs 

The majority of the survey questions focused on why 
district-wide tests are administered and how the results
are used. In the very first question, we asked the
testing administrator or other appropriate staff mem- 
ber to tell us the major purposes of the district testing 
program. Typically, each district mentioned two to four 
different major purposes. Not surprisingly, the most
common set of responses related to student assessment 
purposes. Nearly 90 percent of districts mentioned 
student assessment purposes as one of the major pur- 
poses of their district-wide testing program. See Fig- 
ure 1 for a display of these results. 

Figure 1

This overall purpose can be best understood by looking
at the subcategories within this umbrella category.
Over 70 percent of the districts stated
that measuring student performance is the major
purpose of their testing program. The general idea 
expressed by these districts is that test scores provide 
information about the overall level of student perfor- 
mance, much like a thermometer or speedometer. The 
test scores identify overall student strengths and weak- 
nesses, they provide feedback relative to district and/ 
or state goals, they provide external evidence of stu- 
dent improvement and growth, and they give “a gen- 
eral reading of how students are doing in the 
instructional program.” In a closely related category, 
slightly fewer than one-quarter of districts specifi- 
cally mentioned that a major purpose of their test- 
ing program is to compare their students, schools 
and district to national norms as an external check. 

Still within the category of student assessment is a 
separate grouping of responses that refer to using assess- 
ment results to identify or place students in special pro- 
grams or to refer them for particular instruction. These 
programs run the range from special education to gifted 
programs. Identifying low achieving students and stu- 
dents with other difficulties in order to provide them 
with needed services, placing students in appropriate 
classes, and pre-screening for learning disabilities are in- 
cluded here. Almost 30 percent of districts described 
using test scores to identify or place students as a ma- 
jor purpose of their assessment program. 



The second most common purpose 
is curriculum and program evaluation, 
which was mentioned by 56 percent 
of districts. Districts use the results of 
their assessment programs to assess and 
obtain feedback on their curriculum 
needs, to target curriculum areas for im- 
provement, and to adjust and fine-tune 
curriculum sequence and scope. Dis- 
tricts also describe using assessment re- 
sults to provide feedback on instruction 
and to use this information for instruc- 
tional improvements. This process oc- 
curs by providing assessment results in 
terms of areas of strengths and weak- 
nesses. Also, within this category are 
uses related to program evaluation. Dis- 
tricts report using their test results to 
review specific programs and to moni- 
tor their effectiveness. 

The next major purpose of district assessment, cited 
by 24 percent of districts, is for reporting of results to 
parents, the public, and the school board. Districts rely 
on test results to inform parents how well their stu- 
dents are doing academically. They also use the test 
information to inform the broader community about 
the quality of the district’s education program. Report- 
ing test results to the local board of education was also 
mentioned here. In all cases, these uses are related to 
making the district publicly accountable to a variety 
of important stakeholders. 

The fourth major testing purpose can be described as 
planning and goal setting. Seventeen percent of respond- 
ing districts mentioned these activities as an important 
purpose of their assessment program. Several districts 
describe using assessment results for school improvement 
planning and use in continuous quality improvement. 
Other related uses include setting annual goals and then 
reviewing test scores in that context. 

The purposes described above were noted in response 
to an open-ended question. They can be compared to 
responses to a forced-choice question in which the dis- 
trict rated the use of test scores in five areas using a five- 
point scale where 1 represents “not at all” and 5 equals “a 
great extent.” The districts were asked to rate the extent 
to which test score results are used to evaluate district 
programs, school improvement, principals, teachers and 
students (see Figure 2).
In terms of actual usage of test score results, districts rate
“evaluating school improvement” higher than any of the
other areas. About 77 percent of districts chose “to a
great extent” or the next highest category to describe using
test score results to evaluate school improvement. About
two-thirds of districts use the two highest categories to
describe their use of test scores to evaluate district programs.
Using test scores to evaluate students received similar ratings.

It is notable that whereas districts offer student as- 
sessment related purposes as the predominant reasons 
they have their testing program, in actual ratings of 
usage they report more use of assessment results for 
evaluating school improvement and district programs 
than students. 

In contrast to evaluating schools, programs and stu- 
dents, few districts use test scores extensively to evalu- 
ate either principals or teachers. In both cases, the 
most frequent response is “not at all.” 
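
The "two highest categories" percentages reported above can be reproduced from raw five-point ratings with a short calculation. The following is a minimal sketch, not the procedure used by the survey team; the function name `top_two_share` and the example ratings are hypothetical and are not survey data.

```python
from collections import Counter


def top_two_share(ratings, scale_max=5):
    """Return the percentage of ratings in the two highest scale categories."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    # On a 1-5 scale, the two highest categories are 4 and 5.
    top_two = counts[scale_max] + counts[scale_max - 1]
    return 100.0 * top_two / len(ratings)


# Hypothetical ratings from ten districts (illustrative only, not survey data).
example = [5, 4, 4, 3, 5, 2, 4, 1, 3, 4]
print(f"{top_two_share(example):.0f} percent chose 4 or 5")
```

Summarizing by the top two categories treats the scale as ordinal, so no assumption is needed about the distance between adjacent ratings.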



Figure 2

District Use of Test Scores to Evaluate

Types of Assessments Administered

Given the many different purposes that districts have
for administering assessments, it is no surprise that
districts use a great variety of types of assessments.
Most of these are commercially produced “off the shelf”
products, created for general testing purposes. By far
the largest group of these consists of achievement tests,
including the Iowa Tests of Basic Skills (with the Tests of
Achievement Proficiency for high school students), the
Stanford Achievement Tests, the Terra Nova, plus others.
About 90 percent of districts administer standardized
achievement tests to their students (see Figure 3).

Figure 3

The next most prevalent type of assessment is tests
of student aptitude or intelligence. These tests are
most often used for placing students in gifted or remedial
programs. Of the two most commonly reported tests, the
first is the Otis-Lennon School Ability Test (OLSAT),
which measures cognitive abilities and can be used to
compare student ability to achievement. The OLSAT is
designed for use in conjunction with the Stanford
Achievement Tests. The second ability test used in many
districts is the Cognitive Abilities Test (CogAT). This
test is meant to assess students’ abilities in reasoning.
Because it is published by the same company that sells
the Iowa, the two can be used together to compare ability
and achievement. Approximately one-third of districts
administer these aptitude or IQ tests to their students,
though usually only to selected grades.

About 23 percent of districts administer career plan- 
ning and college preparatory instruments to students. 
These are typically given to high school students or 
eighth grade students. The most common of these tests 
is the ACT PLAN, developed and distributed by the 
American College Testing Program. It consists of both 
a set of achievement tests and non-academic sections 
including an interest inventory, and educational and 
occupational plans. Students, parents, and counselors 
use the results for planning post-secondary endeavors 
and for helping with course selection in the final two 
years of high school. The ACT EXPLORE is a similar 
test for eighth grade students who may use the results 
in planning their high school programs. 

About an equal number of school districts have cre- 
ated their own local assessments, aligned with the dis- 
trict curriculum. These are often called CRTs — for 
criterion-referenced tests. Districts use these tests 
for more immediate feedback about student 
progress through the local curriculum. These as- 
sessments are often described as “curriculum em- 
bedded” and provide information specific to the 
district instructional program. 

Finally, 15 percent of districts administer diagnos- 
tic tests, most frequently to students in primary 
grades. The two most used of these tests are the
Gates-MacGinitie Reading Tests and the Developmental
Reading Assessment. These tests are administered to
provide detailed, in-depth information about
students' strengths and weaknesses, with instructional
implications for improvement.

More About Achievement Tests:
Grades Tested, Time on Testing,
and Cost

Nearly every district that responded to this survey
administers a standardized achievement battery in some
grades. In elementary grades these tests are most
typically administered in reading, math, science and
social studies, beginning in grade two or three (though
more than half of districts also test first graders),
through eighth grade. Grades three through eight are the
most tested
grades, with between 90 and 100 percent of districts 
giving assessments in these grades. In high schools, 
on the other hand, about one-half of districts admin- 
ister achievement tests to students in grades nine 
and eleven, with somewhat more testing tenth grad- 
ers. Twelfth grade achievement testing is rare. Most 
districts test in either fall or spring (these two times 
are equally popular) though about 20 percent test 
in winter. A few districts test both fall and spring 
in order to measure growth within the school year. 

The annual testing time required for achievement 
batteries ranges from a low of two hours in districts 
that test only math and reading to a high of six to 
eight hours. Districts with the greatest amount of test- 
ing time assess more subjects, including writing. 

Districts had some difficulty in estimating the 
total cost of their testing programs. The average 
estimate, however, was in the range of $11 to $15
per student.

Strengths of District 
Testing Programs 

The districts noted numerous strengths with their test- 
ing program (see Figure 4). More than half of them 
described strengths in terms of the Quality and In- 
tegrity of the Testing Program. Many attributes con- 
tribute to the overall quality. The most frequent 
comments emphasized the consistency, objectivity, 
fairness, accuracy, and credibility of the testing pro- 
gram. Almost one-quarter of districts used one of these 
specific terms in describing their strengths. They also 
said that the tests had “very solid reputations,” they
were “widely used,” and, importantly, “we have faith
in it.”

Figure 4

Another important aspect of the perceived high 
quality of the testing programs is the ability to track 
trends from one year to the next. Nine districts stated 
their ability to compile historical data to examine 
trends contributed to the strength of the testing sys- 
tem. An equal number described the importance of 
national norms, because they show “where students 
are on a national level” and that the “national comparison
gives us a broader perspective.” Finally, districts also
cited the attractiveness of the testing materials and the
support for the program from teachers and parents as
contributing to the quality of the assessment programs.

The second largest category of strengths concerns 
how assessment programs help to Identify Student 
Strengths and Weaknesses. Districts made comments 
like, “the tests provide an accurate measure of how 
students are doing,” we “can tell if students need ex- 
tra help” and they identify “student strengths and 
weaknesses to allow us to better meet their needs.” 
Twenty-eight percent of districts used similar language 
to describe strengths of their testing programs. 

The next most prevalent response is related to Curricular
Alignment. Twenty-one percent of districts
made comments about alignment between their test 
and their curriculum, instruction or learning stan- 
dards. Of these, a small number use a standardized 
test that is specifically designed to measure the Illi- 
nois Learning Standards. These districts saw this align- 
ment to state learning standards as a strength of their 
testing program. More generally, respondents commented
that “the tests are as closely aligned to the
curriculum as possible,” “it’s a pretty good match to our
instructional program,” and “it covers areas that are
important.” These districts acknowledged that while
their testing program may not be strictly aligned to
state learning standards they have confidence that they
are measuring the same important expectations.

Figure 5

The final major category of strengths, noted by 13 
percent of districts, is their ability to Evaluate Cur- 
riculum and Programs. Testing results help districts 
to “evaluate the strengths and weaknesses of the cur- 
riculum,” they “strengthen curriculum decisions,” and 
they “identify curriculum areas that need addressing.” 
Districts noted several other strengths, including the 
ability to communicate with parents and teachers, and 
that their assessment programs provided them with a 
variety of different measures of students’ progress. 

Weaknesses of District 
Testing Programs 

The most frequently stated weakness of district assessment
programs, noted by 24 percent of respondents, is
Lack of Alignment with learning standards, curriculum,
and instruction (see Figure 5). Comments reported by
districts include: “it doesn’t always match
what we are teaching and our classroom practices,”
“there is not a perfect alignment to curriculum” and “it
doesn’t always measure what our curriculum teaches.”
Several districts used virtually identical phrases: “not
tied directly to Illinois Learning Standards.” One
district said that the results are “based on someone
else’s norm group - what exactly is that?”

... are able to isolate the relatively few districts that administer locally
developed tests and analyze their strengths and weaknesses separately from
other districts. The strengths of these include being curriculum-based and
aligned so that they meet the needs of their students. The weaknesses include
not being professionally developed and the difficulties of charting trends and
disaggregating results by groups of students.

Figure 6

District Ratings of ISAT

An equal number of dis- 
tricts — 24 percent — pointed 
to the amount of Time Taken from Instruction as a 
problem in their testing programs. Administering tests 
takes teacher time and student time, with the net ef- 
fect that less time is available for instruction. In a 
related vein, 14 percent of respondents mention high
Cost as a weakness. 

A final category, noted by 18 percent of districts, is 
Results Not Used to Full Potential. These responses 
focused on problems in interpreting results, the need 
for additional training in using test results, the time 
needed to analyze results, and possible misinterpreta- 
tion by non-educators. 

Finally, there were a number of other responses 
that did not fit as neatly into categories. Six dis- 
tricts said that students either do not take the tests 
as seriously as they should or that there is too much 
stress associated with testing. Several districts noted 
the need for more diverse assessments and fewer 
multiple-choice assessments. Other districts noted 
shortcomings that were within their own ability to 
remedy (e.g., time of year tests administered, tests 
too easy for student population). 

Rating the ISAT 

Though most of the questions on the survey dealt with 
district or local assessment programs, the final ques- 
tions focused on the Illinois Standards Achievement 
Testing (ISAT) program. The first set of these asked 
districts to rate the ISAT program using a five-point 
scale ranging from poor (given a value of 1) to excel- 
lent (given a value of 5). Districts rated the ISAT pro- 
gram on how well it is aligned to the Illinois Learning 
Standards, month administered, speed of reporting
results, reporting format, and grades tested (see Fig- 
ure 6). Among these five areas, districts gave the high- 
est rating to alignment to learning standards, though 
in absolute terms even this item does not receive a 
very high rating. Fewer than half of the districts use 
category 4 or 5 in rating the alignment of ISAT to 
state learning standards. The ISAT reporting format 
and the grades tested also receive relatively high rat- 
ings, with 41 and 40 percent respectively of districts 
using the two highest categories. The final two items — 
the month that the ISAT is administered (which was 
February 2000) and speed of reporting — both receive 
quite low ratings. Only 21 and 19 percent respectively 
use the two high categories. In both cases, the most 
frequent rating for these two items is “poor,” the low- 
est possible rating. 

Improving the ISAT 

A final, open-ended question on the survey asked dis- 
tricts to describe how the ISAT program could be im- 
proved (see Figure 7). The most frequent response 
made by about one-quarter of districts was the Need 
for Stability and Consistency. Of these comments, 
nearly all used the specific words “consistency” and 
“stability.” Districts reported that changes in test format
and grades tested are disruptive to districts and
make the possibility of tracking trends over time diffi- 
cult, if not impossible. There is a very vocal desire 
among the responding districts for the State Board of 
Education to “make a plan and stick to it.”
Figure 7

An equal number of districts (25 percent) urged Faster
Turnaround. It takes “way too long” to get test results
back. Twenty percent of districts asked to Change the
Test Date to later in the school year. However, a few
districts requested early fall testing.

A number of districts (14 percent) asked to Increase the
Number of Grades Tested. One rationale for the
increased testing is that if adjacent grades are tested,
then test score gains can be calculated. Gain scores
provide the foundation for “value-added” measures
of school improvement. Another rationale is that
with more grades tested, districts might be able to
reduce their own testing.
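
The gain-score rationale can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: scores from the two adjacent grades are taken to be on a common scale, students are matched by an identifier, and the function names and example scores are hypothetical rather than drawn from the ISAT or from this survey.

```python
def gain_scores(prior, current):
    """Return per-student gains for students tested in both years.

    `prior` and `current` map a student identifier to a scale score.
    Students missing from either year are left out of the result.
    """
    return {sid: current[sid] - prior[sid] for sid in prior if sid in current}


def mean_gain(prior, current):
    """Return the average gain across students tested in both years."""
    gains = gain_scores(prior, current)
    if not gains:
        raise ValueError("no students were tested in both years")
    return sum(gains.values()) / len(gains)


# Hypothetical scale scores (illustrative only): grade 3, then grade 4 a year later.
grade3 = {"s1": 150, "s2": 162, "s3": 148}
grade4 = {"s1": 158, "s2": 171, "s4": 160}

print(gain_scores(grade3, grade4))
print(mean_gain(grade3, grade4))
```

Only students with scores in both years contribute to the gain, which is why testing adjacent grades matters: when tested grades are several years apart, far fewer students can be matched.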

Fourteen percent of districts suggested that ISBE 
De-emphasize Accountability and Emphasize School 
Improvement aspects of the assessment program. They 
expressed some frustrations with the use of test results 
to compare schools to each other, at the expense of 
providing useful information for improvement activi- 
ties. Interestingly in this context, several districts (about 
eight percent) think that ISAT would be improved by 
making the testing system High Stakes for Students. 
They believe the test should be made to pressure stu- 
dents to perform and achieve better. 

Between eight and 10 percent of districts suggested
that ISAT would be improved by making the follow- 
ing changes. Better Score Reporting includes that the 
test results take into account the background of the 
students, that results be reported via computer, that 
additional item analyses are included, and that both 
national and international comparisons be made. For 
Improved Communication, districts requested bet- 
ter coordination with teachers and districts, more train- 
ing on what the test scores mean, and training in test 
score uses for policy makers. More Support to Dis- 
tricts includes greater assistance and more resources 
related to the learning standards and assistance in using
test results for curriculum improvement. Several
districts requested Closer Alignment between ISAT 
and the learning standards, specific links between ques- 
tions and standards, and wider awareness of which 
standards are tested and which are not. Districts also 
requested Improved Technical Quality, including 
greater review of questions in the tests, better reliabil- 
ity, and greater involvement of both educators and 
technical experts. Several districts requested more 
open-ended and performance-based questions, and 
more opportunity for applied learning. On a re- 
lated note, several districts suggested that the state 
turn the testing program over to a major commer- 
cial test publisher. Three districts stated that the 
ISAT was too difficult, that expectations were too 
high, and that the content needed to be “more real- 
istic.” Finally, two districts wanted a better alterna- 
tive for students with disabilities. 

Impact of an Improved ISAT 

The final question in the survey asked districts to 
rate the impact that changes or improvements in 
the ISAT would have on their district testing pro- 
grams. The scale ranged from 1 (not at all) to 5 (a 
great extent). As shown in Figure 8, most districts 
used the middle responses to describe the extent of 
changes they would make in response to improvement 
on the ISAT. There are slightly more responses on the 
positive end of the scale (that is, districts indicating 
they will make changes in their testing programs) than 
at the lower end (no or few changes); however, the
preponderance of responses in the middle suggests
widespread ambiguity about the effects that changes
on the ISAT will have on district testing programs.

Figure 8

Impact that Improvements to the ISAT
Would Have on District Testing Program

Summary Themes 

A few core themes emerge across all of the different 
questions on this survey. The first is the importance 
and value that districts place on perceived quality and 
trustworthiness of tests. On the whole, they are very 
positive about the standardized tests that they purchase 
for their district testing programs, and much less fa- 
vorable about the state testing program. Districts place 
a lot of faith in their own standardized tests and view 
them as highly trustworthy, reliable, and excellent 
sources of very useful information. They used words
like “quality” and “integrity” in describing these tests.
Districts were clearly less sanguine about the ISAT. 
Not only are they unhappy about the scheduling of 
the test and the turnaround time for scoring, but they 
comment on the need for greater consistency and 
stability in the state testing program. They would 
like to see the same quality in the ISAT that they see 
in their own standardized tests. 

A second theme relates to the alignment among tests,
learning standards, and curriculum. Responses
here are less straightforward. Though many
districts would like better alignment between their 
own testing programs and learning standards, many 
are also content with measuring skills and knowl- 
edge that approximate rather than closely align to 
learning standards. Many districts also rate the ISAT 
positively for measuring state learning standards, 
though others suggest that the ISAT could be more 
closely aligned to the Illinois Learning Standards. This 
suggests the usefulness of assessments that provide 
an “external check” on measuring student performance
as well as the more closely aligned assessments
that provide information in relation to specific learn- 
ing standards. 

Finally, in several instances districts expressed will- 
ingness to make students “accountable” through test 
score results, while at the same time wishing to de- 
emphasize school level accountability. Are test results 
improvement tools or accountability tools? In the 
minds of school districts, there is much uncertainty 
on this issue. 



This report reflects the interpretation of its authors. Although the Consortium assisted in the develop- 
ment of this research, no formal endorsement by its Steering Committee members, their organizations, 
or the Consortium should be assumed. 